Search results for "Global Namespace"

Showing 1 of 1 documents

Streamlining distributed Deep Learning I/O with ad hoc file systems

2021

With evolving techniques to parallelize Deep Learning (DL) and the growing amount of training data and model complexity, High-Performance Computing (HPC) has become increasingly important for machine learning engineers. Although many compute clusters already use learning accelerators or GPUs, HPC storage systems are not suitable for the I/O requirements of DL workflows. Therefore, users typically copy the whole training data to the worker nodes or distribute partitions. Because DL depends on randomized input data, prior work stated that partitioning impacts DL accuracy. Their solutions focused mainly on training I/O performance on a high-speed network but did not cover the data stage-in pro…

Data set, Workflow, Distributed database, Process (engineering), Computer science, business.industry, Deep learning, Distributed computing, Computer data storage, Data deduplication, Artificial intelligence, Global Namespace, business

2021 IEEE International Conference on Cluster Computing (CLUSTER)